Computer Organization


Q231.

The floating point unit of a processor using a design D takes 2t cycles compared to t cycles taken by the fixed point unit. There are two more design suggestions D_1 and D_2. D_1 uses 30% more cycles for fixed point unit but 30% less cycles for floating point unit as compared to design D. D_2 uses 40% less cycles for fixed point unit but 10% more cycles for floating point unit as compared to design D. For a given program which has 80% fixed point operations and 20% floating point operations, which of the following ordering reflects the relative performances of three designs? (D_i > D_j denotes that D_i is faster than D_j)
GateOverflow

Q232.

A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement X = (S - R * (P + Q))/T is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence. \begin{array}{lll} \text{ADD} & \text{$R5,R0,R1$} & \text{$;R5$} \leftarrow \text{R0 + R1} \\ \text{MUL}& \text{$R6,R2,R5$} & \text{$;R6$} \leftarrow \text{R2 * R5} \\ \text{SUB} & \text{$R5,R3,R6$} & \text{$;R5$} \leftarrow \text{R3 -R6} \\ \text{DIV} &\text{$R6,R5,R4$} & \text{$;R6$} \leftarrow \text{R5/R4} \\ \text{STORE} &\text{$R6,X$}& \text{$;X$} \leftarrow \text{R6} \\ \end{array} The number of Read-After-Write (RAW) dependencies, Write-After-Read( WAR) dependencies, and Write-After-Write (WAW) dependencies in the sequence of instructions are, respectively,
GateOverflow

Q233.

A 5-stage pipelined processor has Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Write Operand (WO) stages. The IF, ID, OF and WO stages take 1 clock cycle each for any instruction. The PO stage takes 1 clock cycle for ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6 clock cycles for DIV instruction respectively. Operand forwarding is used in the pipeline. What is the number of clock cycles needed to execute the following sequence of instructions?
GateOverflow

Q234.

Register renaming is done in pipelined processors
GateOverflow

Q235.

A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement X = (S - R * (P + Q))/T is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence. \begin{array}{lll} \text{ADD} & \text{$R5,R0,R1$} & \text{$;R5$} \leftarrow \text{R0 + R1} \\ \text{MUL}& \text{$R6,R2,R5$} & \text{$;R6$} \leftarrow \text{R2 * R5} \\ \text{SUB} & \text{$R5,R3,R6$} & \text{$;R5$} \leftarrow \text{R3 -R6} \\ \text{DIV} &\text{$R6,R5,R4$} & \text{$;R6$} \leftarrow \text{R5/R4} \\ \text{STORE} &\text{$R6,X$}& \text{$;X$} \leftarrow \text{R6} \\ \end{array} The IF, ID and WB stages take 1 clock cycle each. The EX stage takes 1 clock cycle each for the ADD, SUB and STORE operations, and 3 clock cycles each for MUL and DIV operations. Operand forwarding from the EX stage to the ID stage is used. The number of clock cycles required to complete the sequence of instructions is
GateOverflow

Q236.

Consider a 3GHz (gigahertz) processor with a three-stage pipeline and stage latencies \tau _{1}, \tau _{2}, and \tau _{3} such that \tau _{1}=3\tau _{2}/4=2\tau _{3}. If the longest pipeline stage is split into two pipeline stages of equal latency, the new frequency is _____GHz, ignoring delays in the pipeline registers.
GateOverflow

Q237.

Consider two processors P1 and P2 executing the same instruction set. Assume that under identical conditions, for the same input, a program running on P2 takes 25% less time but incurs 20% more CPI (clock cycles per instruction) as compared to the program running on P1. If the clock frequency of P1 is 1GHz, then the clock frequency of P2 (in GHz) is _________.
GateOverflow

Q238.

Which of the following are NOT true in a pipelined processor? I. Bypassing can handle all RAW hazards II. Register renaming can eliminate all register carried WAR hazards III. Control hazard penalties can be eliminated by dynamic branch prediction
GateOverflow

Q239.

A processor takes 12 cycles to complete an instruction I. The corresponding pipelined processor uses 6 stages with the execution times of 3,2,5,4,6 and 2 cycles respectively. What is the asymptotic speedup assuming that a very large number of instructions are to be executed?
GateOverflow

Q240.

Consider a 6-stage instruction pipeline, where all stages are perfectly balanced. Assume that there is no cycle-time overhead of pipelining. When an application is executing on this 6-stage pipeline, the speedup achieved with respect to non-pipelined execution if 25% of the instructions incur 2 pipeline stall cycles is _________.
GateOverflow